Skip to content

Implement some string functions for libzigc #23847

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

RcCreeperTech
Copy link

@RcCreeperTech RcCreeperTech commented May 10, 2025

Progress towards #2879.

  • Add memchr, memrchr, strchr, strchrnul, strrchr, index, rindex, memset, memccpy, memmem, mempcpy, stpcpy, stpncpy, strcat, strcpy, strncpy, strnlen
  • Remove memcmp and depend on compiler_rt implementation

Copy link
Contributor

@rpkak rpkak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about creating a PR with functions like this a few weeks ago but decided not to start working on it until #23538 is resolved because I don't want to make a nearly unnoticeable diversion between libzigc and libc. Since this PR exists now, I tested it against "libc-test/src/functional/string*.c" and found the following error:

src/functional/string_strchr.c:73: strchr(s,128) with align=0 returned 0, wanted str+127
src/functional/string_strchr.c:73: strchr(s,128) with align=1 returned 0, wanted str+127
src/functional/string_strchr.c:73: strchr(s,128) with align=2 returned 0, wanted str+127
src/functional/string_strchr.c:73: strchr(s,128) with align=3 returned 0, wanted str+127
src/functional/string_strchr.c:73: strchr(s,128) with align=4 returned 0, wanted str+127
src/functional/string_strchr.c:73: strchr(s,128) with align=5 returned 0, wanted str+127
src/functional/string_strchr.c:73: strchr(s,128) with align=6 returned 0, wanted str+127
src/functional/string_strchr.c:73: strchr(s,128) with align=7 returned 0, wanted str+127
src/functional/string_strchr.c:74: strchr(s,255) with align=0 returned 0, wanted str+254
src/functional/string_strchr.c:74: strchr(s,255) with align=1 returned 0, wanted str+254
src/functional/string_strchr.c:74: strchr(s,255) with align=2 returned 0, wanted str+254
src/functional/string_strchr.c:74: strchr(s,255) with align=3 returned 0, wanted str+254
src/functional/string_strchr.c:74: strchr(s,255) with align=4 returned 0, wanted str+254
src/functional/string_strchr.c:74: strchr(s,255) with align=5 returned 0, wanted str+254
src/functional/string_strchr.c:74: strchr(s,255) with align=6 returned 0, wanted str+254
src/functional/string_strchr.c:74: strchr(s,255) with align=7 returned 0, wanted str+254

It is caused by @as(c_char, @bitCast(@as(u8, 128))) == -128 != 128.

@alexrp
Copy link
Member

alexrp commented May 12, 2025

Has conflicts after #23835 due to files being moved; a rebase should be enough to deal with it automatically.

@alexrp
Copy link
Member

alexrp commented May 12, 2025

In hindsight, the way I set things up in lib/c.zig didn't really make sense:

zig/lib/c.zig

Lines 16 to 34 in 833d4c9

comptime {
if (builtin.target.isMuslLibC() or builtin.target.isWasiLibC()) {
// Files specific to musl and wasi-libc.
_ = @import("c/string.zig");
_ = @import("c/strings.zig");
}
if (builtin.target.isMuslLibC()) {
// Files specific to musl.
}
if (builtin.target.isWasiLibC()) {
// Files specific to wasi-libc.
}
if (builtin.target.isMinGW()) {
// Files specific to MinGW-w64.
}
}

I suggest you move those two @imports above any of the libc-specific blocks, and instead cover the @exports with appropriate builtin.target.is*LibC() checks in lib/c/string[s].zig. In particular, most of the exports aren't needed for MinGW - only mempcpy and strnlen at this time, AFAICT.

- Add memchr, memrchr, strchr, strchrnul, strrchr
- Remove memcmp and depend on compiler_rt implementation
Add index, rindex, memset
strrchr will return the address of the null terminator it c == 0
- add memccpy, memmem, mempcpy, stpcpy, stpncpy, strcat, strcpy, strncpy, strnlen
- Correct pointer types for alignment
- Use stdlib functions for `memchr` and `memrchr`
- Remove `mempcpy` and `strnlen` implementaitons from `lib/libc/mingw`
- Remove duplicate `__aeabi_mem*` symbols
- fix tests for `strchr` and `strrchr`
@alexrp
Copy link
Member

alexrp commented May 15, 2025

This'll need a rebase due to #23840 being merged, but feel free to wait until I actually review the function implementations here.

fn stpcpy(noalias dest: [*:0]c_char, noalias src: [*:0]const c_char) callconv(.c) [*:0]c_char {
var d: [*:0]c_char = dest;
var s: [*:0]const c_char = src;
// QUESTION: is std.mem.span -> @memset more efficient here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'd have to just measure this with a few different array sizes.

try std.testing.expect(strrchr(foo, 0) == (foo + 5));
}

fn __aeabi_memclr(dest: *anyopaque, n: usize) callconv(.c) *anyopaque {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks unused? We also have this in compiler-rt anyway.

}

fn memccpy(noalias dest: *anyopaque, noalias src: *const anyopaque, c: c_int, n: usize) callconv(.c) ?*anyopaque {
@setRuntimeSafety(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would this need to be different than the mode set for the compilation unit?

Comment on lines +163 to +164
const d: [*]c_char = @ptrCast(dest);
const s: [*]const c_char = @ptrCast(src);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to pointer cast; you can change the parameter types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My rationale for this was that I wanted to keep the function signature as close to the c signature as possible. I was basing this decision off of a comment that you made in a prior post here. Technically in this case this would actually increase the type information, but it would not match 1:1 with the c ABI so I am unsure which approach is better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I forgot about that. If you don't mind, a comment would help remind future me, and future contributors as well, of the rationale behind this.

@setRuntimeSafety(false);
const d: [*]c_char = @ptrCast(dest);
const s: [*]const c_char = @ptrCast(src);
const needle: c_char = @intCast(c);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment explaining why the @intCast is justified (does the spec say what happens if you pass an int that does not fit into a char?)

Comment on lines +166 to +170
var idx: usize = 0;
while (idx < n) : (idx += 1) {
d[idx] = s[idx];
if (d[idx] == needle) return d + idx + 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like it could be implemented better using already existing std.mem functions, which are SIMD optimized.

const src: []const u8 = "supercalifragilisticexpialidocious";
var dst: [src.len]u8 = @splat(0);
const dOffset = std.mem.indexOfScalar(u8, src, 'd').?;
const endPtr: *anyopaque = @ptrCast(@as([*]u8, &dst) + dOffset + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to use @ptrCast in a unit test.

@andrewrk
Copy link
Member

andrewrk commented Jun 2, 2025

Since this is your first time contributing, I suggest to start with only one function. After you get the hang of it, it could make sense to make a larger PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants